This thesis describes the design and implementation of a
language-independent reusable code generator for Prime 400 and 50-Series
computers.

A  code  generator  is  the  portion  of  a  compiler  that  converts  an 
internal  representation  of  the  semantics  of  a  program  into  equivalent 
machine code. Construction of a code generator requires a major effort, so
it should be done as infrequently as possible. One way to make this
possible is to build a code generator that may be re-used from compiler to
compiler. 

Several  factors  influence  the  design  of  such  a  code  generator, 
including the nature of the communications channel between the code
generator and the rest of the compiler, the structure of the information
passed to the code generator (the "intermediate form"), the form of output
code desired, and finally the limitations of the machine architecture and
existing systems software.

The  code  generator  implemented  processes  a  high-level,  tree- 
structured  intermediate  form,  performing  translation  by  case  analysis  and 
optimization  by  eliminating  redundant  load  operations.  It  produces  a 
stream  of  assembly  language  source  code  which  may  then  be  assembled, 
loaded, and executed.

Experience with two compiler implementations has shown that the
reusable code generator approach is feasible. However, several improvements
in the present code generator would be desirable.

CHAPTER 1
Introduction 


1.1 Motivation

The School of Information and Computer Science maintains a network of
five medium-scale Prime computers for both academic and research
activities. Despite the natural growth in "Prime expertise" resulting from
five years of use, there has been no successful local compiler
implementation. The few attempts that have been made were stymied by
various difficulties in dealing with the machine architecture and systems
software.

Nevertheless, there are several reasons for undertaking further
compiler implementation projects on the School's Prime computers:

.  Both  ICS  and  the  Prime-using  segment  of  the  business  community  are 
interested in the C programming language [Kernighan 1978]. A wealth
of  software  exists  in  the  form  of  C  programs;  a  C  compiler  on  the 
Primes  would  thus  enhance  their  usability  and  effectiveness. 

.  Existing  compilers  for  Prime  computers  are  large  programs  that 
seriously  impair  system  throughput  when  they  are  run.  For  example, 
the  PL/I  subset  G  compiler  processes  programs  at  approximately  500 
lines  per  CPU  minute,  referencing  384K  bytes  of  shared  code  and  256K 
bytes  of  private  data  space.  Three  concurrent  PL/I  compilations 
cause  excessive  paging  on  a  Prime  550  with  1.5  megabytes  of  main 
memory,  pushing  system  response  time  to  the  point  of  user 
frustration,  even  for  simple  operations  like  logging  in.  In  an 
academic environment where compilations are frequent, this is
unacceptable. Local replacements for existing compilers could improve
system  throughput  as  well  as  offer  useful  new  features. 

.  Computer  science  courses,  like  those  in  physics  or  electrical 
engineering,  use  laboratories  to  provide  students  with  "hands-on" 
experience. ICS majors need access to large software systems like
compilers, but commercial compilers are frequently inaccessible for
legal reasons. Locally-developed "training compilers" could meet
the need.

. Research projects within the School occasionally require language
translators for special applications. For example, researchers in
the fields of artificial intelligence and fully distributed data
processing have identified needs for programming languages that are
not implemented on the Primes.

1.2 Compiler Design

To see how a re-usable code generator can help to meet the School's
needs,  it  is  necessary  to  consider  current  compiler  design  practices. 

In what has come to be the "classic" scheme, a compiler is composed
of  four  parts: 

.  The  lexical  analyzer  converts  a  stream  of  characters  supplied  by  the 
user into higher level "tokens." This process is analogous to the
way  a  person  groups  written  letters  to  form  words. 

.  The  syntax  analyzer  groups  the  tokens  produced  by  the  lexical 
analyzer  into  structures,  according  to  the  rules  of  a  grammar.  In  a 
similar  vein,  a  person  groups  words  to  form  phrases  and  sentences. 

.  The  semantic  analyzer  extracts  the  "meaning"  of  the  structures 
produced  by  the  syntax  analyzer.  In  the  case  of  people,  sentences 
are  "interpreted." 

. The code generator in a sense inverts the preceding processes: it
synthesizes a sequence of instructions that meets the lexical and
syntactic requirements of a computer's machine language while
ensuring that the sequence is "semantically equivalent" to the original
program. As with a human translator, it is essential that the code
generator have an excellent command of the language into which it is
translating. Otherwise, there is much less incentive to use a
compiler; it might be more economical to produce machine code by hand.

The  machine  instructions  produced  by  the  code  generator  are  interpreted  by 
a  computer  to  perform  the  task  expressed  by  the  original  program. 


1.3 Pragmatics

Today, lexical and syntax analysis are well-understood; a large body
of theoretical results has made it possible to automate the construction of
lexical and syntax analyzers (see for example [Aho 1972]). Semantic
analysis is more complex and consequently less well-understood. Code
generation is in a similar state; the best automatic code generation
algorithms are little better than heuristically controlled searches
[Graham 1980].

For economic reasons, it is desirable to minimize the amount of
"custom-crafted" single-use code in a given compiler implementation.
Compilers are large, complex pieces of software, and writing one is an
expensive task. However, if portions of a compiler may be re-used in
subsequent implementations, the total amortized cost of the compiler can be
reduced. 

In  large  measure,  the  lexical,  syntax,  and  semantic  analysis 
portions  of  a  compiler  for  language  X  can  be  made  independent  of  the 
machine  on  which  X  is  to  run.  Similarly,  the  code  generation  portion  of  a 
compiler  can  be  made  largely  independent  of  the  language  being  compiled. 
In theory, one would have a single "front end" for each language to be
compiled, and a single "back end" for each target machine. In this way, a
compiler  for  any  language  would  be  available  for  any  machine;  it  would  only 
be  necessary  to  connect  the  appropriate  front  end  to  the  proper  back  end. 
This  scheme  maximizes  re-usability  of  compiler  implementation  code,  thus 
minimizing  cost.  Unfortunately,  it  is  not  possible  in  practice;  there  are 
significant  differences  between  languages  and  between  target  machines  which 
make such an approach infeasible. However, it is possible to make
substantial progress toward the goal of re-usability with a practical
implementation  of  a  back  end  for  one  machine. 

A re-usable code generator for Prime hardware is desirable for
another reason. The Prime architecture (relevant aspects of which will be
discussed in more detail below) suffers from a number of shortcomings that
make it inhospitable to high-level language compilers. For example, a
number of operations are simply missing from the instruction set; it is
possible to do 32 bit wide logical "and" operations, but there is no 32 bit
logical "or" instruction. The available addressing formats make it
necessary to reference separately-compiled objects with indirect addresses,
thus requiring knowledge of external objects at compile time as well as
forcing the use of different addressing techniques. The method of memory
segmentation is closely connected to the implementation of several
instructions, causing array indexing to fail when an array overlaps a
boundary between memory segments. For these and other reasons, code
generation for the Primes is particularly difficult. It seems best to
invest the effort in building a code generator just once, making it
re-usable for future compiler implementations.
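
To illustrate the kind of work a missing instruction pushes onto a code
generator, the following Python sketch models two generic ways of
synthesizing a 32 bit "or." It is not taken from the thesis: the existence
of a 16 bit "or" and of a 32 bit complement on the target machine are
assumptions made for the example, and all function names are hypothetical.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def or16(a, b):
    """Stand-in for a 16 bit logical OR instruction (assumed to exist)."""
    return (a | b) & MASK16


def and32(a, b):
    """Stand-in for the 32 bit logical AND that the text says is available."""
    return a & b & MASK32


def or32_by_halves(a, b):
    """Synthesize a 32 bit OR from two 16 bit ORs, one per halfword."""
    high = or16((a >> 16) & MASK16, (b >> 16) & MASK16)
    low = or16(a & MASK16, b & MASK16)
    return (high << 16) | low


def or32_by_de_morgan(a, b):
    """Synthesize a 32 bit OR from AND and complement: a|b = ~(~a & ~b)."""
    not_a = ~a & MASK32  # 32 bit complement (assumed to be available)
    not_b = ~b & MASK32
    return ~and32(not_a, not_b) & MASK32
```

Either expansion costs several instructions where one would do, which is
why such cases are worth solving once inside a shared code generator.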

The  ICS  Prime  computer  systems  support  a  number  of  "software  tools" 
designed  to  simplify  the  processes  of  lexical  and  syntactic  analysis. 
Unfortunately,  there  is  no  analogous  support  for  code  generation. 

The  central  problem  of  this  thesis  may  be  stated  as  follows:  Design 
and  implement  a  code  generator  for  Prime  computers.  The  code  generator 
must  present  an  interface  that  may  be  used  by  a  variety  of  front-end 
language processors. Furthermore, the interface should depend on
features  of  the  underlying  machine  architecture  as  little  as  possible. 

The code generator should produce instructions that are common to
all  the  machines  in  the  ICS  Computing  Laboratory  Prime  complex.  The  code 
generator  should  be  "fast,"  at  least  in  comparison  to  the  code  generation 
phases  of  compilers  that  are  already  available.  It  should  produce  machine 
code  programs  of  high  enough  quality  that  there  is  little  temptation  to  use 
an  existing  compiler  or  to  write  programs  directly  in  machine  code  when 
code  efficiency  is  the  major  issue. 

1.5  Related  Efforts 

A number of other efforts have influenced the direction of this
thesis:

.  The  author's  senior  design  project  at  Georgia  Tech  [Akin  1979] 
involved,  among  other  things,  the  construction  of  a  compiler  for  a 
microcomputer  systems  programming  language.  The  need  for  identical 
source  programs  to  run  on  two  different  microcomputers  led  to  the 
factoring  of  the  compiler  into  a  machine  independent  front  end  and 
two  machine  dependent  back  ends.  Experience  with  the  interface 
between  segments  of  the  compiler  strongly  affected  the  design  of  the 
interface for the present code generator.

. Stephen Johnson's portable C compiler [Johnson 1979] shows a
successful approach to code generation that differs from the one taken in
this project: "pcc" uses a machine independent code generator with
tables that are tailored by end-users for particular machines.
However, Johnson's ideas on code optimization were used in the
current effort without much change.

. The Charrette Ada Compiler project [Lamb 1980] provided insight into
the problems of developing a truly language-independent intermediate
form (interface between front end and code generator). In the
Charrette project, the front end produced a tree-structured
intermediate form known as TCOL-Ada. The success of TCOL-Ada was an
important factor in the decision to use a tree-structured
intermediate form in this project. (Ada is a trademark of the U.S.
Department of Defense.)

CHAPTER 2

Design  Considerations 

The design of the code generator was largely driven by environmental
factors: existing means of communication between front end and code
generator, machine code file formats, system software limitations, and
constraints imposed by the machine architecture. Fortunately, there were a
few degrees of freedom in the design, particularly in the format of the
intermediate code used for communication between front ends and the code
generator.

2.1 Communications Channel

Given  that  compilers  using  the  code  generator  will  be  composed  of  a 
front end (lexical/syntactic/semantic analysis) and a back end (code
generation),  how  should  communication  between  the  two  components  be 
arranged?  There  are  several  points  to  consider: 

.  The  amount  of  information  passed  from  front  end  to  back  end  varies 
with  the  size  of  the  source  code  program,  and  (as  will  be  seen 
below)  the  entire  source  program  must  be  processed  by  the  front  end 
before  code  generation  can  begin.  Therefore,  no  assumptions  can  be 
made about limiting the amount of intermediate code; the
communications medium must be capable of queueing a large amount of
data. 

.  Both  the  medium  and  the  encoding  of  the  intermediate  information 
should  be  independent  of  source  language  and  target  machine. 

.  The  programming  methodology  described  in  Software  Tools  [Kernighan 
1976]  has  been  incorporated  in  the  Software  Tools  Subsystem  [Akin 
1980]  running  on  the  ICS  Prime  computers.  The  Subsystem  provides 
significant advantages to users willing to follow certain
conventions for inter-program communication.

The requirement for a communications medium of unbounded size
clearly indicates the need for some sort of file on mass storage. The
Software Tools Subsystem particularly encourages files of textual data
(ASCII characters). The requirement for language and machine independence
favors the use of character representation of integers, which can be
produced by virtually all I/O support systems and which port easily from
machine to machine. (This has the advantage that, during debugging, the
compiler writer can view or edit the output of his front end without having
to code special tools for the purpose; unfortunately, the conversion to and
from the textual representation slows down code generation.) Therefore the
communications channel should be one or more temporary text files, and the
encoding technique will be conversion to character representation of
integers.
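
The channel described above can be modeled in a few lines of Python. This
is only a sketch: the one-integer-per-line layout, the ".imf" suffix, and
the function names are assumptions made for illustration, not details of
the actual implementation.

```python
import os
import tempfile


def write_stream(values, path):
    """Front end side: write each integer as ASCII text, one per line."""
    with open(path, "w", encoding="ascii") as handle:
        for value in values:
            handle.write(f"{value}\n")


def read_stream(path):
    """Code generator side: convert the text back into integers."""
    with open(path, encoding="ascii") as handle:
        return [int(line) for line in handle if line.strip()]


def round_trip(values):
    """Pass a sequence of integers through a temporary text file."""
    descriptor, path = tempfile.mkstemp(suffix=".imf", text=True)
    os.close(descriptor)
    try:
        write_stream(values, path)
        return read_stream(path)
    finally:
        os.remove(path)
```

Because the file holds nothing but printable characters, it can be listed
or patched with an ordinary text editor, which is the debugging advantage
noted above; the int-to-text and text-to-int conversions are the cost.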

2.2 Intermediate Form

The  intermediate  form  (IMF)  is  the  language  used  by  the  front  end  to 
communicate  the  semantics  of  a  compiled  program  to  the  code  generator.  It 
may be considered the instruction set of a "virtual" computer, in which
case  the  code  generator  is  best  viewed  as  a  translator  of  virtual  machine 
instructions  into  actual  machine  instructions. 

The  design  of  the  IMF  breaks  down  into  three  parts:  the  selection 
of  "operators"  (virtual  machine  instructions),  the  definition  of  the 
primitive  data  types  on  which  the  operators  are  used,  and  the  selection  of 
a  structure  in  which  the  operators  are  imbedded. 

2.2.1 Operators

The  choice  of  IMF  operators  is  essentially  unconstrained,  but  a  number 
of  relevant  observations  may  be  drawn  from  experience: 

. Operators may be low-level (close to actual machine instructions) or
high-level (more abstract, closer to typical programming language
operations).

. Higher-level operators provide more context information, allowing
more straightforward translation to efficient machine instructions.
For example, a "range check" operator must be implemented with two
compare-and-skip tests on the Prime. There is a clever way of
interlacing the two tests which is valuable for range testing but
practically useless for combining two tests in the general case.
The use of a range check operator in the IMF allows the code
generator to produce efficient code for range checking without
wasting time trying to optimize the more common general case (two tests
in a row).

.  High-level  IMF  operators  may  simplify  code  generation  algorithms. 
For example, the presence of an "if-statement" operator might
guarantee the code generator that control enters statements in the
"else-part" of the "if" from only one point. This would allow
tracking of register contents across the basic block boundary at the
beginning  of  the  "else-part."  Without  the  "if-statement"  operator, 
it  might  be  necessary  to  construct  a  complete  program  flow  graph  to 
get  the  same  information. 

.  If  an  IMF  operator  is  conceptually  similar  to  a  high-level  language 
construct,  that  construct  is  easily  translated  by  simply  generating 
the  IMF  operator. 

2.2.2  Data  Types 

IMF  operators  express  data  manipulations  and  abstractions  of  control 
flow.  Additional  information  is  required  to  describe  the  data  that  is  to 
be  manipulated.  The  situation  is  complicated  by  inherently  machine- 
dependent  data  definitions  that  are  available  in  languages  like  C  and  Ada. 

In  the  present  work,  this  issue  was  addressed  by  parameterizing  the 
types  of  data  handled  by  the  IMF  "virtual  machine."  This  allows  machine 
dependencies  in  data  description  to  be  restricted  to  fairly  small  parts  of 
the  front  end. 

2.2.3  Structure 

[Gries  1971]  discusses  a  variety  of  structures  for  intermediate  forms: 
triples,  indirect  triples,  quadruples,  Polish  notation,  etc.  The  tree 
structure  selected  for  this  project  has  a  number  of  advantages: 

. Trees are easily generated during top-down or bottom-up parses.

.  When  expressions  are  represented  as  trees,  there  is  no  need  for  the 
front  end  to  handle  allocation  of  temporary  variables. 

.  Trees  are  easily  linearized  by  converting  them  to  Polish  notation. 
Thus, they meet the requirements of the sequential communications
channel between the front end and the code generator.

. Tree formats are flexible; for instance, operators with varying
numbers of operands are easily accommodated.

.  Many  algorithms  related  to  code  generation  are  expressed  in  terms  of 
operations  on  graphs;  constant  folding,  operand  reordering,  common 
subexpression elimination, and global register tracking are
examples. These algorithms may often be applied to the tree-structured
IMF  directly. 
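
Two of the points above, linearization to Polish (prefix) notation and
tree-based constant folding, can be sketched in Python as follows. The
operator names, the tuple representation of nodes, and the fixed arity
table are hypothetical; they are not the IMF operators of Appendix A.

```python
import operator

# Hypothetical operators with fixed operand counts.
ARITY = {"add": 2, "sub": 2, "mul": 2, "neg": 1}
LEAVES = {"const", "var"}
FOLDERS = {"add": operator.add, "sub": operator.sub,
           "mul": operator.mul, "neg": operator.neg}


def linearize(node):
    """Flatten a tree into prefix order: operator first, then operands."""
    op = node[0]
    if op in LEAVES:
        return [op, node[1]]
    tokens = [op]
    for child in node[1:]:
        tokens.extend(linearize(child))
    return tokens


def rebuild(tokens):
    """Reconstruct the tree from its prefix form in one sequential read."""
    stream = iter(tokens)

    def parse():
        op = next(stream)
        if op in LEAVES:
            return (op, next(stream))
        children = [parse() for _ in range(ARITY[op])]
        return (op, *children)

    return parse()


def fold(node):
    """Replace operators whose operands are all constants by their value."""
    op = node[0]
    if op in LEAVES:
        return node
    children = [fold(child) for child in node[1:]]
    if all(child[0] == "const" for child in children):
        return ("const", FOLDERS[op](*(child[1] for child in children)))
    return (op, *children)
```

The reader never needs to look ahead or back up, so the prefix form suits
a sequential file; an operator with a varying number of operands could be
handled by writing its operand count immediately after the operator.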

2.2.4  Results 

The  intermediate  form  devised  for  this  project  is  tree-structured, 
with  about  70  operators  and  seven  primitive  data  types  (see  Appendix  A). 
It  defines  an  expression-oriented  virtual  machine  language  with  sufficient 
power to support programming languages on the level of C or Pascal. For a
simple  example  of  the  intermediate  form,  see  Appendix  B;  for  a  tutorial  and 
a  complete  set  of  examples,  see  [Akin  1981]. 

2.3  Output 

There  are  two  alternative  formats  of  code  generator  output:  object 
code  and  assembly  language  source  code. 

Object  code  is  compact,  and  comparatively  quick  to  produce  on  most 
machines.  On  the  Primes,  unfortunately,  object  code  formats  are  extremely 
complex.  Furthermore,  Prime  has  scheduled  changes  to  its  object  code 
formats  in  the  near  future,  so  use  of  the  current  formats  would  guarantee 
quick  obsolescence  of  the  code  generator. 

Assembly  language  source  code  is  bulky  and  therefore  incurs  extra 
overhead  in  production.  In  addition,  the  assembler  must  be  invoked  to 
produce  the  final  object  file.  Prime's  assembler  uses  a  three-pass 
algorithm;  the  first  pass  essentially  does  nothing  but  recoup  information 
that was available to the code generator but was, of necessity, lost in the
translation  to  assembly  source.  Thus,  the  time  required  for  this  first 
pass  is  simply  wasted.  However,  use  of  the  assembler  insulates  the  code 
generator  from  details  of  the  object  code  format. 

In the final analysis, there was no choice: the planned changes to
Prime's object code formats make assembly language output the only viable
option. Accordingly, the code generator was designed to produce source
code for a final pass by the assembler.

2.4 System Software Limitations

A  few  code  generator  features  were  mandated  by  the  limitations  of 
Prime’s  system  software. 

For  example,  the  names  of  all  entry  points  (typically  procedure 
names)  must  be  listed  at  the  beginning  of  the  object  code  module  in  which 
they  are  defined.  This  implies  that  the  names  of  all  procedures  must  be 
known  before  any  code  can  be  generated  in  a  given  module.  One  approach, 
ruled  out  because  of  its  slowness,  would  be  to  scan  the  entire  stream  of 
IMF,  picking  out  procedure  names  and  generating  entry  point  declarations 
for  them.  The  approach  actually  taken  requires  the  front  end  to  generate 
another  stream  of  IMF  containing  procedure  names  as  it  makes  the  first  pass 
over  the  source  program.  The  code  generator  reads  this  stream  first, 
produces  the  list  of  entry  points,  then  reads  the  main  stream  and  generates 
code. 
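
The two-stream arrangement can be pictured with the following Python
sketch. It is illustrative only: the "ENT" directive spelling, the
one-name-per-line layout of the side stream, and the placeholder
translate function are assumptions, not the generator's actual formats.

```python
import io


def translate(imf_line):
    """Placeholder for real code generation: echo the IMF as a comment."""
    return f"* {imf_line.strip()}\n"


def generate_module(entry_stream, main_stream, out):
    """Emit entry point declarations first, then code for the main stream."""
    for line in entry_stream:
        name = line.strip()
        if name:
            out.write(f"ENT {name}\n")  # hypothetical directive spelling
    for line in main_stream:
        if line.strip():
            out.write(translate(line))


def demo():
    """Run the sketch on a tiny made-up module and return the output lines."""
    entries = io.StringIO("main\nhelper\n")
    body = io.StringIO("proc main\ncall helper\nproc helper\n")
    out = io.StringIO()
    generate_module(entries, body, out)
    return out.getvalue().splitlines()
```

The short side stream is read once, so the main stream never has to be
scanned twice; the price is a small extra obligation on every front end.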

A major problem with maximum program size follows from the decision
to  use  Prime’s  assembler.  Despite  the  256  megawords  of  address  space 
available  to  each  user,  the  assembler  cannot  handle  a  module  any  larger 
than  65,536  words.  Although  this  is  a  disadvantage  from  the  user's  point 
of  view,  it  simplifies  the  code  generator  since  16  bit  arithmetic  is 
sufficient  to  calculate  any  address  within  a  module. 

2.5  Machine  Architecture 

The Prime P400/P550 architecture ([Prime 1979]) imposed a number of
constraints  on  code  generator  functionality. 

Each Prime computer supports a number of "addressing modes"
(actually different instruction sets) in order to maintain compatibility
with earlier Prime product lines. The code generator's target addressing
mode is "64V mode," the only addressing mode common to all Prime machines
in  the  ICS  laboratory  that  is  capable  of  addressing  more  than  64K  words  of 
data.  64V  mode  provides  an  accumulator-based  instruction  set  and  segmented 
virtual  memory. 

In 64V mode, only the memory segments referenced by three base
registers are readily accessible. The "stack frame" referenced by register
SB contains the activation record for the last procedure invoked, the "link
frame" referenced by register LB contains static and global variables and
non-reentrant code, and the "procedure frame" referenced by register PB
contains the reentrant code of the currently-executing procedure. Neither
the stack frame nor the link frame can occupy more than one memory segment,
so it is safe for the code generator to use hardware indexing, which fails
on multi-segment data structures.

The  accumulator-based  architecture  of  the  64V  mode  instruction  set 
has  an  advantage:  there  is  only  one  register  for  each  of  the  primitive 
data  types  (integer,  floating  point,  etc.),  so  a  very  simple  register 
management  algorithm  suffices.  Unfortunately,  some  of  the  registers 
physically  overlap,  so  problems  do  arise  occasionally. 

Most  64V  mode  instructions  cannot  directly  address  all  locations  in 
memory.  Typically,  instructions  are  restricted  to  a  small  local  address 
range  and  must  use  indirect  addressing  to  reference  memory  outside  that 
range.  Since  the  addresses  of  external  objects  cannot  be  known  at  compile 
time, it must be assumed that they will lie out of the local address range
and thus must be addressed indirectly. It is frequently the case (e.g.,
with  procedure  calls)  that  an  object  must  be  referenced  before  it  is  known 
to  be  internal  or  external,  leaving  the  code  generator  with  a  difficult 
decision:  should  direct  addressing  be  used?  There  are  three  possible 
approaches:  (1)  always  use  indirect  addressing,  thereby  adding 
considerable  unnecessary  overhead  to  internal  object  references;  (2)  scan 
the  entire  IMF  stream  and  determine  which  objects  are  external,  then 
generate  indirect  addresses  for  those  objects  only;  (3)  require  the  front 
end  to  supply  another  stream  of  IMF  listing  externally  defined  objects.  As 
in  the  case  of  the  entry  points  described  above,  the  most  viable  solution 
is  to  require  the  additional  stream  of  IMF. 
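
A minimal Python sketch of approach (3) follows. The layout of the
externals stream (one name per line) and the function names are
assumptions made for illustration.

```python
def read_externals(stream):
    """Collect the names listed in the side stream of external objects."""
    return {line.strip() for line in stream if line.strip()}


def addressing_mode(name, externals):
    """Choose the addressing technique for one reference."""
    return "indirect" if name in externals else "direct"


def plan_references(names, external_stream):
    """Decide the addressing mode of each reference, in order of use."""
    externals = read_externals(external_stream)
    return [(name, addressing_mode(name, externals)) for name in names]
```

With the set of external names loaded before code generation starts, each
reference is classified by a single lookup at the moment it is met, even
if the object's definition appears later in the main stream.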

CHAPTER 3

Implementation Overview

The implementation of the code generator proceeded in four steps:
identification  and  selection  of  IMF  operators,  hand  generation  of  machine 
code  sequences  for  those  operators,  development  of  case-analysis  algorithms 
to  select  proper  code  sequences  for  generation,  and  development  of  simple 
optimization  algorithms. 

3.1 IMF Operator Selection

The  final  set  of  operators  is  semantically  very  close  to  the 
operations  of  the  language  C  [Kernighan  1978],  since  one  of  the  initial 
purposes  of  the  code  generator  was  to  support  a  C  compiler.  Several 
operators  were  added  to  support  Pascal-like  operations,  particularly  range 
checking,  and  to  provide  "escape  hatches"  for  calls  to  run-time  support 
routines.  All  operators  were  subjected  to  examination  based  on  the 
criteria  discussed  in  Chapter  II  before  selection.  The  complete  set  of 
operators is listed in Appendix A.

3.2 Code Sequence Selection

According  to  the  hypothesis  advanced  in  Chapter  II,  the  use  of  high- 
level IMF operators should contribute to the quality of the output code,
since the additional information supplied by the operators contributes to
selection of special cases. The next implementation step was thus to
develop,  by  hand,  the  sequences  of  machine  code  that  should  be  generated 
for  each  operator  on  each  type  of  operand  in  each  context  in  which  the 
operator  could  legally  appear. 

The  selection  of  code  sequences  began  with  hand-coding  by  the  author 
and  continued  with  several  iterations  of  examination  and  improvement  by  the 
author and two expert assembly language programmers. Several criteria
guided  the  code  sequence  selection  process: 

.  Execution  time  should  be  minimized.  This  usually  involved  careful 
study  of  the  64V  mode  instruction  timings,  searching  for  faster 
alternative  code  sequences. 

. Code size should be minimized. Where two alternatives were equally
fast or not readily compared in speed, the smaller sequence was
preferred.

.  Memory  references  should  be  minimized.  In  practice  it  has  often 
been found that a code sequence that is theoretically faster turns
out to be inferior because it involves two single-word fetches,
rather  than  an  interleaved  double-word  fetch. 

.  Since  the  Primes  buffer  access  to  main  memory  with  a  high-speed 
cache, the number of memory references that can be satisfied from
cache storage should be maximized. As in the case above, unexpected
irregularities in execution times arise because of the pattern of
accesses to the cache memory. If at all possible, references to a
single  memory  location  should  be  placed  temporally  close  together, 
to  maximize  the  likelihood  of  finding  the  contents  of  that  location 
in  the  cache. 

.  Generation  of  "overhead  code"  (like  loading  the  auxiliary  base 
register to access some location in memory) should be avoided as
long  as  possible.  In  many  cases  the  extra  instructions  are  subsumed 
by  the  addressing  modes  used  in  subsequent  instructions. 
Unfortunately,  this  guideline  fails  in  certain  cases,  particularly 
when  code  motion  optimizations  might  remove  the  loading  of  the 
auxiliary  base  register  from  a  loop. 

.  Code  sequences  should  be  matched  to  their  most  common  usages.  For 
example,  most  alternatives  in  multiway  branches  are  selected  by  case 
label  values  that  form  a  small,  dense  set  of  integers.  With  some 
effort, these branches can usually be implemented with a few
"computed go-to" instructions, which are considerably faster than a
sequence  of  tests  and  branches. 
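
The last guideline can be made concrete with a small Python sketch that
decides between a jump table and a chain of comparisons. The density
threshold of one half and all names are assumptions chosen for the
example; the criterion actually used by the code generator is not given
here.

```python
def plan_multiway(labels, min_density=0.5):
    """Pick a strategy for a multiway branch from its case label values."""
    values = sorted(set(labels))
    low, high = values[0], values[-1]
    span = high - low + 1
    if len(values) / span >= min_density:
        return ("jump_table", low, span)
    return ("compare_chain", values)


def build_table(cases, low, span, default):
    """Lay out one target per value in [low, low + span); gaps get default."""
    return [cases.get(low + offset, default) for offset in range(span)]


def dispatch(value, low, table, default):
    """Model a computed go-to: one bounds check, then an indexed jump."""
    index = value - low
    if 0 <= index < len(table):
        return table[index]
    return default
```

For labels such as 1, 2, 3, 4, 6 the table has six slots and one gap; for
labels such as 1, 100, 10000 the table would be almost entirely empty, so
the comparison chain is the better choice.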

IMF  operators  may  appear  in  a  number  of  different  contexts,  which 
strongly  affect  the  code  sequence  that  must  be  generated  for  them.  The 
code  generator  recognizes  five  such  contexts  internally: 

. Reach. In this case, the operator is being used to return the
address of an object in memory, if possible, and a value in a
register otherwise. This context is of particular use in evaluating
the left-hand-sides of assignments and the operands of arithmetic
operators. Generally, use of the "reach" context implies the
generation of a "memory reference" instruction in the final code
sequence.

.  Load.  In  this  context,  the  operator  is  being  used  to  return  a  value 
in a register. This is the usual context for obtaining the result
of  an  arithmetic  operator. 

.  Flow.  An  operator  in  "flow"  context  yields  a  change  in  the  flow-of- 
control,  rather  than  a  value  in  a  register.  This  is  the  context  in 
which  loop  termination  expressions  are  evaluated,  for  example. 

.  Void.  Voided  operators  yield  side  effects  only.  This  is  the 
context  in  which  most  programming  language  statements  are  evaluated; 
for  instance,  an  assignment  statement  has  only  the  side  effects  of 
evaluating  both  the  left  and  right  hand  sides,  then  copying  the 
value  of  the  right  into  the  object  on  the  left.  In  "load"  context, 
the  same  assignment  would  also  yield  the  value  of  the  right  hand 
side. 

.  Argument  Pointer  (often  abbreviated  "AP").  AP  is  the  context  in 
which  actual  parameters  of  procedures  are  evaluated.  In  such  a 
context,  all  operators  must  yield  the  address  of  an  object  in 
memory,  even  if  it  is  necessary  to  allocate  a  section  of  memory  and 
copy  the  result  of  the  operator  into  it. 
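
The difference between contexts can be sketched for the assignment example given above. The Python fragment below is only an illustration under stated assumptions: the `Context` enumeration, the function `gen_assign`, and the two-instruction sequence are hypothetical, and only the "void" and "load" contexts are shown.

```python
from enum import Enum


class Context(Enum):
    REACH = "reach"
    LOAD = "load"
    FLOW = "flow"
    VOID = "void"
    AP = "ap"


def gen_assign(left_addr, right_addr, context):
    """Return (code, result_location) for 'left = right' on 16-bit integers.

    The instruction sequence is illustrative, not the generator's actual output.
    """
    code = ["LDA " + right_addr, "STA " + left_addr]
    if context is Context.VOID:
        return code, None        # side effects only; no value is yielded
    if context is Context.LOAD:
        return code, "A"         # the right-hand value is left in register A
    raise NotImplementedError(context)


print(gen_assign("SB%+20", "SB%+30", Context.VOID))
print(gen_assign("SB%+20", "SB%+30", Context.LOAD))
```

In this sketch the emitted instructions are the same in both contexts; what changes is the result the caller is told it may rely on.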

Within  a  given  context,  an  operator  will  be  translated  according  to 
one or more "cases", usually depending on the accessibility of its
operands.  For  example,  the  "subtract"  operator  has  three  cases,  depending 
on  whether  its  right  operand  may  be  addressed  directly,  its  left  operand 
may  be  addressed  directly,  or  neither  operand  may  be  addressed  directly 
(e.g.,  both  are  expressions  yielding  values  in  registers). 

Finally,  within  a  given  case  in  a  given  context,  final  instruction 
sequences  may  be  devised  for  each  type  of  data  that  a  given  operator  may 
use.  Separate registers and different instructions are required for the
manipulation  of,  say,  integer  and  floating  point  data. 

As  an  example,  the  final  list  of  code  sequences  for  "load"  context 
amounted to 6000 lines of text.  On the average, there were three cases for
each operator, and five subcases for each applicable data type.  The
majority  of  subcases  for  the  arithmetic  operators  were  quite  similar  to  one 
another  in  form. 

Code sequences for the other contexts were derived directly from the
"load" sequences.  In virtually all cases, it was necessary only to delete
some  code  from  the  "load"  sequence  or  append  an  instruction  or  two  to 
satisfy  the  requirements  for  other  contexts. 

3.3  Algorithm for Code Generation

The  overall  control  routine  for  the  code  generator  is  simple: 

for each input module
   for each entry point
      output an entry point declaration
   for each static data declaration
      reserve link-frame space for a pointer
      output an "indirect pointer" to the external object
   for each static data definition
      reserve link-frame space for the object
      initialize the object's value
   for each procedure
      reconstruct the IMF tree
      walk the tree, transforming IMF to machine code
      optimize the machine code
      convert machine code to assembly language source

The heart of the code generator is the procedure handling algorithm.  In
the next few sections, it will be examined in detail.

3.3.1  Tree Reconstruction

The  result  of  syntactic  and  semantic  analysis  by  the  front  end  is  an 
intermediate  form  tree  for  each  procedure  in  the  source  program.  In  order 
to  transmit  a  tree  to  the  code  generator,  the  front  end  traverses  the  tree, 
directly  writing  the  values  of  IMF  operator  parameters  and  recursively 
writing  the  contents  of  subtrees.  The  result  is  a  copy  of  the  IMF  tree 
expressed in prefix Polish notation, which is passed through the
communications channel to the code generator.

Within the code generator, the tree-building routines have access to
a table containing descriptions of each IMF operator: its size, the number
and types of its operands, etc.  The first step is to read an integer from
the input stream; this gives the IMF operator that appears next in the
input.  The descriptor table is then accessed using the operator number as
a key.  The table entry gives the size of the tree node, which is then
allocated in tree memory.  The remainder of the table entry describes the
operator's parameters, usually strings, integers, or subtrees.  Successive
portions of the table entry are interpreted, causing one read from the
input stream for each integer parameter, several reads (one for each
character) for each string parameter, and a recursive call on the tree
builder for each subtree parameter.  Values returned for each parameter are
placed in the previously-allocated node, and then the node is returned.
The final result is a duplicate of the procedure tree built by the front
end.
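
The round trip described above can be sketched in Python. This is a minimal illustration, not the actual implementation: the descriptor table layout, the parameter kinds assigned to each operator, and the object ids 100 through 102 are assumptions, and string parameters are omitted. The operator and mode numbers are those shown in Appendix B.

```python
# Hypothetical descriptor table: operator number -> (name, parameter kinds).
DESCRIPTORS = {
    2:  ("ADD_OP",    ["int", "tree", "tree"]),   # data mode, left, right
    62: ("SUB_OP",    ["int", "tree", "tree"]),
    40: ("OBJECT_OP", ["int", "int"]),            # data mode, object id
}


def write_tree(node, out):
    """Front-end side: emit a node in prefix order as a flat list of integers."""
    op, params = node
    out.append(op)
    for kind, value in zip(DESCRIPTORS[op][1], params):
        if kind == "tree":
            write_tree(value, out)     # subtrees are written recursively
        else:
            out.append(value)          # integer parameters are written directly
    return out


def read_tree(stream):
    """Code-generator side: rebuild a node from an iterator of integers."""
    op = next(stream)                  # the operator number comes first
    _name, kinds = DESCRIPTORS[op]     # the table entry drives the remaining reads
    params = []
    for kind in kinds:
        if kind == "tree":
            params.append(read_tree(stream))
        else:
            params.append(next(stream))
    return (op, params)


# a - (b + c), with made-up object ids
tree = (62, [1, (40, [1, 100]), (2, [1, (40, [1, 101]), (40, [1, 102])])])
flat = write_tree(tree, [])
print(flat)
print(read_tree(iter(flat)) == tree)
```

Because the descriptor table fixes the number and kind of parameters for every operator, the flat stream needs no delimiters; the reader always knows how much input belongs to the current node.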


3.3.2  IMF Transformation

The code sequence to be emitted for an operator depends on the context
in which it appears, the accessibility of its operands, and the type of
data being manipulated.  Context information is available at the root of
the tree and spreads down to the leaves.  Operand accessibility is known at
the leaves and induced from the leaves to the root.  Data type information
is supplied by the front end for every operator, so it is immediately
available.

Postorder tree traversal is an efficient algorithm for propagating
accessibility information from the leaves up, and preorder tree traversal
is efficient for propagating context information down from the root.  A
simple combination of the two forms the framework for procedure code
generation.

Internally, generation of code for an IMF subtree is accomplished by
calling one of the routines "reach," "load," "flow," or "void" (AP context
is handled as a special case within "load").  This causes the root of the
subtree to be visited.  Depending on the inherited context and the operator
at the root of the tree, one of the generation routines is recursively
called for each operand of the root.

Each of the code generation routines returns information about the
subtree it just transformed: the location of the result (in memory or in a
register), what registers were used to obtain the result, and a linked list
of the machine instructions generated to calculate the result at execution
time.  All of this information comprises the operand accessibility data
needed for code sequence selection.

Once the operands have been evaluated, code for the root is
generated.  The  code  sequences  for  the  operands  are  then  linked  with  the 
code  sequence  for  the  root.  Finally,  a  tally  is  made  of  the  registers  used 
and  the  entire  collection  of  information  is  returned. 

Appendix B illustrates the code generation process for a small
subtree.


3.3.3  Optimization

Register tracking forms the basis of the currently implemented
optimizer.  Within the code generator, each register is associated with a
state variable that indicates whether the register's contents are "known"
or "unknown."  If the contents are known, the register is also associated
with an "address descriptor" that pinpoints the memory locations that
supplied the register's contents.

The  optimization  process  is  a  pass  over  the  linked  list  of  procedure 
code.  Whenever  possible,  general  purpose  instructions  are  replaced  with 
faster  special-purpose  instructions;  for  example,  "LDA  =0"  (load  register  A 
with  the  value  zero)  is  replaced  with  "CRA"  (clear  register  A).  After 
replacement,  the  effects  of  the  instructions  on  register  contents  are 
simulated.  Whenever a "load register" instruction is encountered, and the
contents of the register to be loaded will not be changed by the
instruction, the instruction is eliminated.  If a register must be loaded with a
new  value,  and  that  value  is  known  to  reside  in  another  register,  the  load 
instruction  is  replaced  with  a  register-to-register  transfer. 

The most difficult part of the optimization process is the
instruction simulation.  In general, a "load" instruction causes a register's
state to become known and its contents equal to those of a particular
memory location.  Most instructions (e.g. arithmetic operations) cause one
or more registers' contents to become unknown.  "Store" instructions may
alter arbitrary locations in memory, thus invalidating a register/memory
equivalence; the exact effects are dependent on the particular "store"
instruction used.  Memory "aliasing" is a particularly nasty problem; the
optimizer takes a highly conservative approach and after any store destroys
equivalences based on indirect or indexed addresses.
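
The register tracking pass can be sketched as follows. This Python fragment is a simplified model under several assumptions, not the optimizer itself: only registers A and X are tracked, instructions are (opcode, operand) pairs, the transfer mnemonic is formed as "T" plus the two register names, the suffixes ",*" and ",X" are assumed to mark indirect and indexed operands, and every instruction other than a load or store is treated as making all registers unknown.

```python
LOADS = {"LDA": "A", "LDX": "X"}
STORES = {"STA": "A", "STX": "X"}


def aliasable(addr):
    # Assumed notation: ",*" marks an indirect address, ",X" an indexed one.
    return addr.endswith(",*") or addr.endswith(",X")


def optimize(code):
    known = {}      # register -> address descriptor it is known to equal
    out = []
    for op, arg in code:
        if op in LOADS:
            reg = LOADS[op]
            if known.get(reg) == arg:
                continue                                 # redundant load: eliminate
            holder = next((r for r, a in known.items() if a == arg), None)
            if op == "LDA" and arg == "=0":
                out.append(("CRA", None))                # special-purpose form
            elif holder is not None:
                out.append(("T" + holder + reg, None))   # register-to-register transfer
            else:
                out.append((op, arg))
            if reg == "X":                               # indexed descriptors go stale
                known = {r: a for r, a in known.items() if not a.endswith(",X")}
            if reg == "X" and arg.endswith(",X"):
                known.pop("X", None)
            else:
                known[reg] = arg
        elif op in STORES:
            out.append((op, arg))
            # Conservative rule: forget the stored-to location and every
            # equivalence based on an indirect or indexed address.
            known = {r: a for r, a in known.items()
                     if a != arg and not aliasable(a)}
            if not aliasable(arg):
                known[STORES[op]] = arg
        else:
            out.append((op, arg))
            known.clear()       # simplification: all registers become unknown
    return out


program = [("LDA", "=0"), ("STA", "SB%+20"), ("LDA", "SB%+20"),
           ("LDX", "SB%+20"), ("ADD", "SB%+30"), ("LDA", "SB%+20")]
for instruction in optimize(program):
    print(instruction)
```

In the sample program the second load of A is dropped, the load of X becomes a transfer from A, and the final load survives because the intervening ADD makes the contents of A unknown.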


3.4


The code generator is written in Ratfor, a Fortran preprocessor
language described in [Kernighan 1976].  It is approximately 12,000 lines
in length, of which nearly 7,700 (64%) are devoted to the selection of code
sequences and 700 (6%) to optimization.  The remainder is devoted primarily
to input/output, storage management, and simulation of heterogeneous data
structures with Fortran arrays.

Although  large,  the  code  generator  is  relatively  easy  to  manage, 
since  most  of  the  Ratfor  code  deals  with  independent  case  analyses.  This 
is  reflected  in  the  subprogram  call  tree,  reproduced  in  part  below: 

module
   initialize
   generate_entries
   generate_static_stuff
   generate_procedures
      reach
         reach_assign
         ...  (approximately 6 other routines)
         reach_seq
         load
      load
         load_addaa
            load, reach
         load_if
            flow, load, reach
         ...  (approximately 60 other routines)
         load_xor
            load, reach
      flow
         flow_eq
            load, reach
         ...  (approximately 12 other routines)
         flow_switch
            load, reach, void
         load
      void
         void_addaa
            load, reach
         ...  (approximately 4 other routines)
         void_seq
            load, reach
         load
      optimize
      put_instr

The call tree is broad, but not particularly deep.  The bulk of the code is
in the descendants of "load," and these routines rarely interact with one
another.

On the average, a "load" routine is about 77 lines of code.
Although this is larger than optimal, most "load" routines are easily
comprehended, since they are straightforward case analyses.  No attempt was
made to eliminate duplicated code.  A sample code generation routine is
presented in Appendix D.

CHAPTER 4
Experience

The code generator has been used to implement two compilers: a full-scale
compiler for the language C and a demonstration compiler for a small
teaching language.  The C compiler runs almost twice as fast as Prime's Fortran
77 and Pascal compilers (700 lines per minute vs. 400 lines per minute, on a
Prime 550).  It is also somewhat smaller (in terms of code size) than
Fortran 77 or Pascal (2 segments vs. 4 and 3 segments, respectively).
Hand inspections and informal benchmarks indicate that the code produced is
generally superior to that produced by Pascal, PL/I, and Fortran 77; in
particular, fewer base register loads are generated, and operations on
packed data structures are performed without resorting to the field
manipulation instructions.

Examination  of  the  code  generator's  output  indicates  a  few  areas 
that  need  improvement,  though.  The  most  obvious  is  register  tracking 
across  basic  block  boundaries,  particularly  in  loops.  Truly  excellent  code 
can  be  produced  whenever  an  arithmetic  loop  control  variable  can  be  pushed 
into  an  index  register,  but  present  optimization  forces  stores  and  loads  at 
the  boundaries  of  the  basic  block  containing  the  loop  body. 

The  intermediate  form  could  stand  a  few  modifications.  For  example, 
there is no way to specify that an array is 65,536 words long.  This is no
great  problem  at  the  moment,  but  should  be  fixed  in  the  future.  As  another 
example,  comparison  of  structure  or  array  operands  requires  information  on 
the  length  of  the  operands.  There  is  presently  no  space  reserved  in  the 
IMF  comparison  operators  for  this  information. 

Several informal measurements of code generator performance have
been made.  Initially, almost 50% of code generation execution time was
devoted to reading the ASCII textual input.  Special-casing the
character-to-binary conversion and eliminating some logical redundancy within the
input routine reduced execution time by 30%.  Presumably, elimination of
the character-to-binary conversion would speed up execution even more.

CHAPTER 5
Conclusions

The  separation  of  lexical/syntactic/semantic  analysis  from  code  generation 
and  the  development  of  a  standard  "intermediate  form"  allows  many  compilers 
to  use  the  same  code  generator. 

Use  of  the  code  generator  significantly  reduces  the  amount  of  effort 
required  to  implement  compilers  on  Prime  computers. 

The  case-analysis  approach  to  code  generation  is  effective,  at  least 
when compared to the algorithms used in existing compilers for Prime
computers.  As a consequence, however, the code generator is a very large
piece of software, with a great number of almost-identical runs of code.

The  code  generator's  effectiveness  has  been  demonstrated  for  a  C 
front  end,  but  it  seems  likely  some  additions  must  be  made  to  make  it 
equally  effective  for  Pascal  and  other  languages. 


CHAPTER  6 
Recommendations 

There  are  several  areas  in  which  the  code  generator  might  be  improved. 

Many  of  the  special  cases  that  are  currently  handled  by  open  code 
essentially  involve  emitting  special  instructions  when  an  operand  has  a 
particular  value.  Clearly  these  could  be  encoded  in  a  table,  with 
consequent reduction in code generator size and complexity (although
possibly increasing run time, as well).
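
A table of this kind might look like the following Python sketch. Only the first entry (replacing "LDA =0" with "CRA") comes from the optimization discussion in Chapter 3; the other two entries, the table name, and the function `emit` are illustrative assumptions.

```python
# (opcode, operand) -> special-purpose replacement.
# Only the LDA entry is taken from the text; the others are assumed examples.
SPECIAL_CASES = {
    ("LDA", "=0"): "CRA",   # clear register A
    ("ADD", "=1"): "A1A",   # assumed: add one to register A
    ("SUB", "=1"): "S1A",   # assumed: subtract one from register A
}


def emit(opcode, operand):
    """Return the special form if the table has one, else the general form."""
    special = SPECIAL_CASES.get((opcode, operand))
    return special if special is not None else opcode + " " + operand


print(emit("LDA", "=0"))
print(emit("LDA", "SB%+20"))
```

Adding a new special case then means adding a table entry rather than another branch of open code, at the cost of one lookup per emitted instruction.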

Data  packing  is  not  treated  properly  in  the  current  implementation. 
The  only  operator  that  explicitly  deals  with  packed  data  is  the  FIELD 
operator,  and  it  must  be  inserted  in  the  proper  places  by  the  front  end. 
The  definition  of  FIELD  is  machine-dependent  in  the  extreme.  A  better 
approach  would  be  to  generalize  the  concept  of  "address  descriptor"  used 
throughout  the  code  generator,  allowing  any  operator  to  take  packed 
operands  directly.  A  few  IMF  operators  (INDEX  and  SELECT,  especially) 
would  need  to  be  extended  to  take  full  advantage  of  the  added  generality. 
There  are  several  special  cases  (for  instance  comparison  of  fields  to 
constants)  which  should  be  exploited. 

In  the  intermediate  form,  data  types  are  restricted  to  integer, 
unsigned,  long  integer,  long  unsigned,  single  precision  floating,  double 
precision floating, and stowed (structures and arrays).  Machine
independence could be improved by using precision and range specifications like
those  available  in  Ada. 

The  code  generator  is  written  in  Ratfor,  a  FORTRAN  preprocessor 
language.  This  has  the  advantage  of  considerable  support  from  the  Software 
Tools  Subsystem  running  on  the  ICS  Prime  computers,  and  it  makes  use  of  the 
fast  FORTRAN  66  compiler.  However,  the  lack  of  pointers  and  heterogeneous 
data  structures  in  FORTRAN  makes  the  code  slower  and  more  obtuse  than  it 
needs  to  be.  If  possible,  the  code  generator  should  be  re-written  in  a 
more  reasonable  language.  C  would  be  a  good  candidate;  Pascal  would  also 
be  a  good  choice  if  a  better  compiler  implementation  becomes  available. 

One  of  the  important  features  of  block  structured  languages  that  the 
code  generator  does  not  directly  support  is  that  of  nested  scopes.  This 
feature  requires  a  "display"  of  pointers  to  currently-active  stack  frames. 
Although the display can be conveniently fabricated with existing IMF
operators, the present storage allocation algorithm and forward reference
resolution techniques are inadequate.  For example, to process a procedure
B nested in a procedure A, the offsets of A's variables in its stack frame
must be known.  This will not be the case unless code for A has been
generated.  Unfortunately, this cannot be done unless the code for B has
been  queued  somewhere,  since  the  code  for  B  precedes  the  code  for  A.  Thus 
there  is  a  circular  chain  of  dependencies.  One  possible  solution  would  be 
for  the  front  end  to  allocate  space  for  A's  variables,  then  provide  the 
offsets  for  use  by  B  and  let  the  code  generator  handle  relocation  when  it 
actually  generates  code  for  A.  This  can  be  done  at  present  by  treating  all 
local  variables  as  members  of  structures,  but  this  imposes  an  unacceptable 
amount  of  machine-dependence  on  the  front  end. 

The present optimization algorithm is not adequate.  Global
propagation of register state information would be very valuable, particularly in
arithmetic  loops.  The  register  tracking  scheme  now  equivalences  a  register 
and  one  location  in  memory;  a  better  approach  would  be  to  build 
"equivalence  classes"  containing  all  registers  and  memory  locations  known 
to  have  the  same  value,  providing  more  opportunities  to  eliminate  load 
instructions  and  perhaps  providing  enough  information  to  hoist  code  from 
loops. 
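
The proposed "equivalence classes" could be represented as in the Python sketch below. This illustrates the recommendation only and is not part of the present code generator; the class name `Equivalences` and its methods are hypothetical.

```python
class Equivalences:
    """Track sets of registers and memory locations known to hold one value."""

    def __init__(self):
        self.class_of = {}      # name -> the shared set that contains it

    def forget(self, name):
        """Remove a register or location whose value has become unknown."""
        cls = self.class_of.pop(name, None)
        if cls is not None:
            cls.discard(name)

    def copy(self, dst, src):
        """Record that dst now holds the same value as src (a load or a store)."""
        if dst == src:
            return
        self.forget(dst)
        cls = self.class_of.setdefault(src, {src})
        cls.add(dst)
        self.class_of[dst] = cls

    def same(self, a, b):
        cls = self.class_of.get(a)
        return a == b or (cls is not None and b in cls)


eq = Equivalences()
eq.copy("A", "SB%+20")          # LDA SB%+20
eq.copy("SB%+40", "A")          # STA SB%+40
print(eq.same("SB%+20", "SB%+40"))
eq.copy("A", "SB%+30")          # LDA SB%+30 replaces the contents of A
print(eq.same("A", "SB%+20"), eq.same("SB%+20", "SB%+40"))
```

Unlike a single register/memory pairing, the class remembers that the two memory locations still agree after register A has been reloaded, which is the extra information a later load elimination could exploit.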

The 32I mode architecture available on the Prime 550 and
higher-numbered models is a multi-register architecture differing somewhat from
64V mode.  It would be interesting to see if the ideas used in this code
generator could be applied to a 32I mode code generator, or if the same
intermediate form would be useful.

The  arrival  of  a  VAX  11/780  at  ICS  within  a  year  poses  a  similar 
question,  since  the  VAX  is  a  general  register  machine.  Could  a  code 
generator  be  devised  for  the  VAX,  allowing  cross-compilation  from  VAX  to 
Prime  and  vice  versa?  Would  the  intermediate  form  prove  portable  enough  to 
permit  retargeting  and  transport  of  compilers?  Since  the  VAX  and  the  Prime 
550  are  both  virtual  memory  machines  with  a  natural  word  width  of  16  bits, 
and  there  are  very  few  other  explicit  machine  dependencies  in  the 
intermediate form, it seems likely that an attempt to implement a
retargetable compiler would be successful.

APPENDIX A

The following list enumerates the primitive data types supported by the
intermediate form.  For a more complete description, see [Akin 1981].


INT_MODE              16-bit signed integer
LONG_INT_MODE         32-bit signed integer
UNS_MODE              16-bit unsigned integer
LONG_UNS_MODE         32-bit unsigned integer
FLOAT_MODE            32-bit floating point
LONG_FLOAT_MODE       64-bit floating point
STOWED_MODE           structure or array data

The following list completely enumerates the intermediate form
operators.  For complete descriptions, see [Akin 1981].

ADDAA_OP              add, assign result to left operand
ADD_OP                add
ANDAA_OP              logical and, assign result to left
AND_OP                logical and
ASSIGN_OP             copy value
BREAK_OP              break out of a loop or case
CASE_OP               case alternative in a switch
COMPL_OP              one's-complement
CONST_OP              define constant
CONVERT_OP            convert data modes
DECLARE_STAT_OP       declare an external static object
DEFAULT_OP            default alternative in a switch
DEFINE_DYNM_OP        define a dynamic local object
DEFINE_STAT_OP        define a static local or global object
DEREF_OP              dereference a pointer
DIVAA_OP              divide, assign result to left operand
DIV_OP                divide
DO_LOOP_OP            test-at-the-bottom loop
EQ_OP                 test for equality
FOR_LOOP_OP           generalized loop
GE_OP                 test for greater-or-equal
GOTO_OP               jump to label
GT_OP                 test for greater-than
IF_OP                 conditional statement/expression
INDEX_OP              select an array element
INITIALIZER_OP        initial value of an object
LABEL_OP              target of a jump
LE_OP                 test for less-than-or-equal-to
LSHIFTAA_OP           shift left, assign result to left
LSHIFT_OP             shift left
LT_OP                 test for less-than
MODULE_OP             beginning of input module
MULAA_OP              multiply, assign result to left
MUL_OP                multiply
NEG_OP                two's-complement
NEXT_OP               force next loop iteration
NE_OP                 test for inequality
NOT_OP                logical negation
NULL_OP               null
OBJECT_OP             reference a variable
ORAA_OP               logical or, assign result to left
OR_OP                 logical or
POSTDEC_OP            C postdecrement
POSTINC_OP            C postincrement
PREDEC_OP             C predecrement
PREINC_OP             C preincrement
PROC_CALL_ARG_OP      procedure call argument
PROC_CALL_OP          procedure call
PROC_DEFN_ARG_OP      procedure formal parameter
PROC_DEFN_OP          procedure definition
REFTO_OP              generate reference to object
REMAA_OP              remainder, assign result to left
REM_OP                remainder
RETURN_OP             return from procedure
RSHIFTAA_OP           right shift, assign to left operand
RSHIFT_OP             right shift
SAND_OP               sequential (short-circuit) and
SELECT_OP             select field of a structure
SEQ_OP                left-to-right sequence
SOR_OP                sequential (short-circuit) or
SUBAA_OP              subtract, assign result to left
SUB_OP                subtract
SWITCH_OP             multiway branch
UNDEFINE_DYNM_OP      undefine local dynamic object
WHILE_LOOP_OP         test-at-the-top loop
XORAA_OP              exclusive-or, assign to left operand
XOR_OP                exclusive-or
ZERO_INITIALIZER_OP   initialize object to zero
FIELD_OP              extract bit field from a word
CHECK_RANGE_OP        check within range
CHECK_UPPER_OP        check less than upper bound
CHECK_LOWER_OP        check greater than lower bound

APPENDIX B
IMF Transformation

The following tables are excerpted from the case analyses of the IMF
operators ADD_OP and SUB_OP.  To generate code, the cases are examined
left-to-right.  The phrase "A not in right regs" may be translated into
English as "Register A is not used during the evaluation of the right
operand."  Note that these operators are members of the same operator class
(reversible dyadic one-register) and have very similar code sequences.

REACH CONTEXT / ADD_OP / INTEGER

A not in right regs  |  A not in left regs  |  A in both regs

load left            |  load right          |  load right
reach right          |  reach left          |  allocate temp
ADD right            |  ADD left            |  STA temp
                     |                      |  load left
                     |                      |  ADD temp
                     |                      |  deallocate temp

REACH CONTEXT / SUB_OP / INTEGER

A not in right regs  |  A not in left regs  |  A in both regs

load left            |  load right          |  load right
reach right          |  reach left          |  allocate temp
SUB right            |  SUB left            |  STA temp
                     |  TCA                 |  load left
                     |                      |  SUB temp
                     |                      |  deallocate temp


Consider  the  generation  of  code  for  the  following  program  fragment: 


integer a, b, c;
begin
   ... a - (b + c) ...
end

The  following  intermediate  form  code  would  be  generated  by  the  front  end: 

62   SUB_OP
1    INT_MODE
40   OBJECT_OP
1    INT_MODE
...  object id for 'a'
2    ADD_OP
1    INT_MODE
40   OBJECT_OP
1    INT_MODE
...  object id for 'b'
40   OBJECT_OP
1    INT_MODE
...  object id for 'c'

The code generation process for this subtree might be traced as follows:

Control enters through "reach" at the subtree rooted with SUB_OP.
Following the definition of SUB_OP that is available internally, "reach"
invokes itself recursively to evaluate the left operand.

The left operand is a simple object, which is reached without
difficulty.  "Reach" returns an address descriptor for the object (say,
"SB%+20"), a null set of registers (none were used), and a null list of
code (none was generated).

"Reach" invokes itself recursively to evaluate the right operand of
the SUB_OP.

The right operand is an ADD_OP.  "Reach" invokes itself recursively
to evaluate the left operand of the ADD.

The left operand is a simple object.  "Reach" returns an address
descriptor (say, "SB%+30"), a null set of registers, and a null code list.

"Reach" invokes itself recursively to evaluate the right operand of
the ADD.

The operand is a simple object.  "Reach" returns an address
descriptor (say, "SB%+40"), a null set of registers, and a null code list.

Control returns to the instantiation of "reach" at the ADD_OP.  The
case analysis for ADD is consulted; the first case applies.  "Reach"
returns the result "in register," a set of registers containing register A,
and the code list "LDA SB%+30; ADD SB%+40."

Control returns to the instantiation of "reach" at the SUB_OP.  The
case analysis for SUB is consulted; since the right operand used the
register A, the second case applies.  "Reach" returns the result "in
register," a set of registers containing only register A, and the code list
"LDA SB%+30; ADD SB%+40; SUB SB%+20; TCA."
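
The trace above can be reproduced with a small Python model of the two case tables. This is a sketch under stated assumptions rather than the code generator's own logic: only 16-bit integer ADD and SUB are handled, operands are either address strings or nested tuples, and the `Result` record, the helper `load`, and the temporary names are hypothetical.

```python
from dataclasses import dataclass
from itertools import count

temps = count(1)        # hypothetical source of unique temporary names


@dataclass
class Result:
    where: str          # address descriptor, or "A" when the value is in register A
    regs: frozenset     # registers used to obtain the result
    code: list          # instructions generated so far


def load(res):
    """Instructions that bring an operand's value into register A."""
    return res.code if res.where == "A" else res.code + ["LDA " + res.where]


def reach(node):
    if isinstance(node, str):                   # simple object: address only
        return Result(node, frozenset(), [])
    op, left, right = node                      # op is "ADD" or "SUB"
    l, r = reach(left), reach(right)
    if "A" not in r.regs:                       # case 1: A not in right regs
        code = load(l) + r.code + [op + " " + r.where]
    elif "A" not in l.regs:                     # case 2: A not in left regs
        code = load(r) + l.code + [op + " " + l.where]
        if op == "SUB":
            code.append("TCA")                  # negate: (r - l) becomes (l - r)
    else:                                       # case 3: A in both regs
        temp = "temp%d" % next(temps)
        code = load(r) + ["STA " + temp] + load(l) + [op + " " + temp]
    return Result("A", l.regs | r.regs | {"A"}, code)


tree = ("SUB", "SB%+20", ("ADD", "SB%+30", "SB%+40"))
print("; ".join(reach(tree).code))
```

The cases are tried left to right, as in the tables, so the cheapest applicable sequence is chosen first and the temporary is allocated only when both operands need register A.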

At this point, good code for the entire subtree has been generated.
It may stand alone or be used by some other code tree of which this subtree
was a part.

APPENDIX C

User-Oriented Documentation

[Akin  1981]  is  a  compendium  of  information  pertaining  to  the  use  of  the 
code  generator,  rather  than  its  internal  structure.  Its  size  (121  pp. 
single-spaced)  precludes  its  inclusion  here.  The  following  paragraphs 
describing the contents of the User's Guide are excerpted from it.

The first chapter of this Guide is the Overview.  The
Overview is a brief summary of the design and construction of
the  code  generator.  This  chapter  may  be  of  general  interest, 
but  it  is  not  necessary  to  read  it  in  order  to  learn  to  use  the 
code  generator. 

The Code Generator Usage chapter describes the location
of the code generator and its associated run-time support
libraries, as well as the Software Tools Subsystem commands
necessary to access them.  Recommended procedure is to study
this section, then generate command language programs to do the
low-level  file  access  operations. 

Input  Data  Stream  Formats  gives  a  bird's-eye  view  of  the 
formats of the three code generator input streams.  This
chapter merits some study, although it is supplemented by the
Extended  Examples. 

The three operator definitions chapters (Operators
Useful in the Static Data Stream, Operators Useful in the
Procedure Definition Stream, Operators Useful in Procedure
Definitions) provide a detailed reference for the intermediate
form operators interpreted by the code generator.  One or two
readings through this chapter are desirable; thereafter, it can
be used as a reference with the Operator/Function Index and the
Table of Contents used as entry points.

The Extended Examples are comprised of several short
(but complete) programs written in the language C.  These
examples include the original C code, annotated versions of the
three  code  generator  input  streams,  and  an  annotated  listing  of 
the  code  generator's  assembly  language  output.  The  chapter 
should  be  useful  in  learning  how  the  various  intermediate  form 
operators  work  together,  and  may  be  used  as  a  reference  when 
building  a  new  front  end. 

'Drift' is a very small expression-based language whose
structure closely mimics the code generator's internal
world-model.  The 'Drift' Compiler is a complete, working compiler
using the code generator as a back-end.  It serves as an
example of one way to construct a front-end for the VCG.

For ease of reference, all the intermediate form
operators have been organized by subject in the Intermediate
Form Operator/Function Index.  Typically, one would look up
some function (e.g., "subscripting") in the Index, find the
name of the appropriate intermediate form operator (e.g.,
INDEX_OP), then look up that operator in the table of contents
to find its complete description.

APPENDIX D

Code Generation Routine

The  following  subprogram  generates  code  for  addition  of  two  values  in  a 
"load"  context.  It  is  typical  of  the  descendants  of  the  code  generator 
routine  "load." 


# load_add --- load value of sum of two subexpressions

   ipointer function load_add (expr, regs)
   tpointer expr
   regset regs

   include VCG_COMMON      # global variables

   logical safe
   regset lregs, rregs, opreg
   ipointer l, r
   ipointer seq, ld, st, gen_mr, reach
   integer lres, rres, lad (ADDR_DESC_SIZE),
      rad (ADDR_DESC_SIZE), opsize, opins,
      tad (ADDR_DESC_SIZE)

   select (Tmem (expr + 1))      # data type
      when (INT_MODE, UNS_MODE) {
         opreg = A_REG
         opsize = 1
         opins = ADD_INS
         }
      when (LONG_INT_MODE, LONG_UNS_MODE) {
         opreg = L_REG
         opsize = 2
         opins = ADL_INS
         }
      when (FLOAT_MODE) {
         opreg = F_REG
         opsize = 2
         opins = FAD_INS
         }
      when (LONG_FLOAT_MODE) {
         opreg = LF_REG
         opsize = 4
         opins = DFAD_INS
         }
   else
      call panic ("ADD_OP has bad data mode (*i)*n"p,
         Tmem (expr + 1))

   l = reach (Tmem (expr + 2), lregs, lres, lad)
   r = reach (Tmem (expr + 3), rregs, rres, rad)

   select
      when (safe (opreg, rregs))    # right doesn't use opreg
         load_add = seq (l,
                         ld (opreg, lres, lad),
                         r,
                         gen_mr (opins, rad))
      when (safe (opreg, lregs))    # left doesn't use opreg
         load_add = seq (r,
                         ld (opreg, rres, rad),
                         l,
                         gen_mr (opins, lad))
   else {                           # both sides use opreg
      load_add = seq (r, ld (opreg, rres, rad))
      call alloc_temp (opsize, tad)
      load_add = seq (load_add,
                      st (opreg, tad),
                      l,
                      ld (opreg, lres, lad),
                      gen_mr (opins, tad))
      call free_temp (tad)
      }

   regs = or (opreg, or (lregs, rregs))

   return
   end